Online multiclass learning with "bandit" feedback under a Passive-Aggressive approach
Abstract
This paper presents a new approach to online multi-class learning with bandit feedback. The algorithm, named PAB (Passive-Aggressive in Bandit), is a variant of the Online Passive-Aggressive algorithm proposed by [2], an effective framework for max-margin online learning. We analyze some of its operating principles and show that it provides a good and scalable solution to the bandit classification problem, particularly on a real-world dataset, where it outperforms the best existing algorithms.
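For orientation, the sketch below shows the classic binary Passive-Aggressive update that the cited framework builds on. It is not the PAB algorithm itself, whose bandit-specific details are not given in this abstract. The function name `pa_update` and the toy data are hypothetical, labels are assumed to lie in {-1, +1}, and the input vector is assumed to be nonzero.

```python
import numpy as np

def pa_update(w, x, y):
    """One classic Passive-Aggressive step for a binary label y in {-1, +1}.

    Illustrative sketch only: this is the full-information binary update,
    not the bandit variant (PAB) described in the abstract.
    """
    # Hinge loss of the current weights on this example.
    loss = max(0.0, 1.0 - y * float(np.dot(w, x)))
    if loss == 0.0:
        return w  # passive: the margin is already satisfied
    # Aggressive: smallest change to w that brings the hinge loss to zero.
    tau = loss / float(np.dot(x, x))  # assumes x is not the zero vector
    return w + tau * y * x

# Toy example (hypothetical data).
w = np.zeros(2)
x = np.array([1.0, 2.0])
w_new = pa_update(w, x, +1)
```

After the step, the example sits on the margin (`y * w_new.dot(x)` is 1 up to floating-point error), so a repeated call with the same example leaves the weights essentially unchanged.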
Similar resources
Boosting with Online Binary Learners for the Multiclass Bandit Problem
We consider the problem of online multiclass prediction in the bandit setting. Compared with the full-information setting, in which the learner can receive the true label as feedback after making each prediction, the bandit setting assumes that the learner can only know the correctness of the predicted label. Because the bandit setting is more restricted, it is difficult to design good bandit l...
New bounds on the price of bandit feedback for mistake-bounded online multiclass learning
This paper is about two generalizations of the mistake bound model to online multiclass classification. In the standard model, the learner receives the correct classification at the end of each round, and in the bandit model, the learner only finds out whether its prediction was correct or not. For a set F of multiclass classifiers, let $\mathrm{opt}_{\mathrm{std}}(F)$ and $\mathrm{opt}_{\mathrm{bandit}}(F)$ be the optimal bounds for lea...
Confusion-Based Online Learning and a Passive-Aggressive Scheme
This paper provides the first analysis, to the best of our knowledge, of online learning algorithms for multiclass problems when the confusion matrix is taken as a performance measure. The work builds upon recent and elegant results on noncommutative concentration inequalities, i.e. concentration inequalities that apply to matrices, and, more precisely, to matrix martingales. We do establish ge...
Efficient Online Bandit Multiclass Learning with Õ(√T) Regret
We present an efficient second-order algorithm with $\tilde{O}(\frac{1}{\eta}\sqrt{T})$ regret for the bandit online multiclass problem. The regret bound holds simultaneously with respect to a family of loss functions parameterized by η, for a range of η restricted by the norm of the competitor. The family of loss functions ranges from hinge loss (η = 0) to squared hinge loss (η = 1). This provides a solution to the...
Efficient Online Bandit Multiclass Learning with $\tilde{O}(\sqrt{T})$ Regret
We present an efficient second-order algorithm with $\tilde{O}(\frac{1}{\eta}\sqrt{T})$ regret for the bandit online multiclass problem. The regret bound holds simultaneously with respect to a family of loss functions parameterized by η, for a range of η restricted by the norm of the competitor. The family of loss functions ranges from hinge loss (η = 0) to squared hinge loss (η = 1). This provides a solution to the ...
Publication date: 2015